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Abstract. Different types of data skew can result in load imbalance in 
the context of parallel joins under the shared nothing architecture. We 
f**~} . study one important type of skew, join product skew (JPS). A static 

£SJ ' approach based on frequency classes is proposed which takes for granted 

the data distribution of join attribute values. It comes from the obser- 
vation that the join selectivity can be expressed as a sum of products of 
frequencies of the join attribute values. As a consequence, an appropriate 
assignment of join sub-tasks, that takes into consideration the magnitude 
of the frequency products can alleviate the join product skew. Motivated 
by the aforementioned remark, we propose an algorithm, called Handling 
rC\ Join Product Skew (HJPS), to handle join product skew. 
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CN ' 1 Introduction 

> 

rf\ _ The limited potentials of centralized database systems in terms of the storage 

f*"*. ■ and the process of large volumes of data has led to the advent of parallel database 

*/") \ management systems (PDBMS) that adopt the shared-nothing architecture. Ac- 

lf) • cording to this architecture, each computational node (database processor) has 

^^ | its own memory and CPU and independently accesses its local disks while it 

is provided with the ability to perform locally relational operations. By defini- 
tion, the aforementioned architecture favors the deployment of data intensive 
scale computing applications |13| by reducing the complexity of the underlying 
infrastructure and the overall cost as well. 

Within the scope of the parallel evaluation of the relational operators by 
C$ ' splitting them into many independent operators (partitioned parallelism), sort- 

merge join and hash-join constitute the main algorithms for the computation of 
the equijoin. Equijoin is a common special case of the join operation R XI S, 
where the join condition consists solely of equalities of the form R.X = S.Y (X 
and Y are assumed to be attributes of the relations R and S respectively) . Both 
algorithms are subject to parallel execution. However, the hash-based algorithm 
has prevailed since it has linear execution cost, and it performs better in the 
presence of data skew as well [3] . 
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The parallel hash-based join processing is separated into three phases. In the 
first phase, each relation is fully declustered horizontally across the database 
processors by applying a partition function on the declustcring attribute, which 
in general is different from the join attribute. Next, at the redistribution phase, 
each database processor applies a common hash function h on the join attribute 
value for its local fragments of relations R and S. The hash function h ships any 
tuple belonging to either relation R or S with join attribute value bi to the h(bi)- 
th database processor. At the end of the redistribution process both relations 
are fully partitioned into disjoint fragments. Lastly, each database processor p 
performs locally with the most cost-effective way an cquijoin operation between 
its fragments of relations R and S, denoted by R p and S p respectively. The 
joined tuples may be kept locally in each database processor instead of being 
merged with other output tuples into a single stream. 

Skewness, perceived as the variance in the response times of the database 
processors involved in the previously described computation, is identified as one 
of the major factors that affects the effectiveness of the hash-based parallel join 
[7J. [5] defines four types of the data skew effect: Tuple placement skew, selec- 
tivity skew, redistribution skew and join product skew. Query load balancing in 
terms of the join operation is very sensitive to the existence of the redistribution 
skew and/or the join product skew. Redistribution skew can be observed after 
the end of the redistribution phase. It happens when at least one database pro- 
cessor has received large number of tuples belonging to a specific relation, say R, 
in comparison to the other processors after the completion of the redistribution 
phase. This imbalance in the number of redistributed tuples is due to the exis- 
tence of naturally skewed values in the join attribute. Redistribution skew can 
be experienced in a subset of database processors. It may also concern both the 
relations R and S (double redistribution skew). Join product skew occurs when 
there is an imbalance in the number of join tuples produced by each database 
processor. [5] points the impact of this type of skewness to the response time 
of the join query. Especially, join product skew deteriorates the performance of 
subsequent join operation since this type of data skew is propagated into the 
query tree. 

In this paper we address the issue of join product skew. Various techniques 
and algorithms have been proposed in the literature to handle this type of skew 
(PQ' IH> El' 0' 0> [12] )• We introduce the notion of frequency classes, whose 
definition is based on the product of frequencies of the join attribute values. 
Under this perspective we examine the cases of homogeneous and heterogeneous 
input relations. 

We also propose a new static algorithm, called HJPS (Handling Join Prod- 
uct Skew) to improve the performance of the parallel joins in the presence of 
this specific type of skewness. The algorithm is based on the intuition that join 
product skew comes into play when the produced tuples associated with a spe- 
cific value overbalance the workload of a processor. HJPS algorithm constitutes 
a refinement of the PRPD algorithm [TT] in the sense that the exact number 
of the needed processors is defined for each skewed value instead of duplicat- 
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ing or redistributing the tuples across all the database processors. Additionally, 
HJPS is advantageous in the case of having join product skew without having 
redistribution skew. 

The rest of this paper is organized as follows. Section [2] discusses the related 
work. In section [3] we illustrate the notion of division of join attribute values 
into classes of frequencies by means of two generic cases. In section @] an algo- 
rithm that helps in reducing join product skew effect is proposed and section [5] 
concludes the paper. 



2 Related Work 



The achievement of load balancing in the presence of redistribution and join 
product skew is related to the development of static and dynamic algorithms. 
In static algorithms it is assumed that adequate information on skewed data 
is known before the application of the algorithm. pQ , [3] and [TT[ expose static 
algorithms. On the contrary, 0, [B] and [12] propose techniques and algorithms 
according to which data skew is detected and encountered dynamically at run 
time. 

[2] , |12j address the issue of the join product skew following a dynamic ap- 
proach. A dynamic parallel join algorithm that employs a two-phase scheduling 
procedure is proposed in [12] , The authors of [2] present an hybrid frequency- 
adaptive algorithm which dynamically combines histogram-based balancing with 
standard hashing methods. The main idea is that the processing of each sub- 
relation, stored in a processor, depends on the join attribute value frequencies 
which are determined by its volume and the hashing distribution. 

P], [D an d E] deal with the join product skew in a static manner. In [IT] , 
authors addresses the issue of the redistribution skew by proposing the PRPD 
algorithm. However, except for redistribution skew, their approach handles the 
join product skew that results from the former. In PRPD algorithm, the redis- 
tribution phase of the hash-join has been modified to some degree. Especially, 
for the equijoin operation R\ m R 2 , the tuples of each sub- relation of R\ with 
skewed join attribute values occurring in R\ are kept locally in the database pro- 
cessor. On the other hand, the tuples that have skewed values happening in R 2 
are broadcast to all the database processor. The remaining tuples of sub-relation 
are hash redistributed. The tuples of each sub-relation of R 2 are treated in the 
respective way. The algorithm captures efficiently the case where some values 
are skewed in both relations. Using the notion of the splitting values stored in a 
split vector, virtual processor partitioning [3] assigns multiple range partitions 
instead of one to each processor. Finally, authors in [T] assign a work weight 
function to each join attribute value in order to generate partitions of nearly 
equal weight. 

Finally, OJSO algorithm [TU] handles data skew effect in an outer join, which 
is a variant of the equijoin operation. 
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3 Two Motivating Examples 

We will assume the simple case of a binary join operation R\(A, B) N R,2(B, C), 
in which the join predicate is of the form R±.B = R2.B. The m discrete values 
61, &2j • • • j b m define the domain D of the join attribute B. Let fi(bj) denote the 
relative frequency of join attribute value bj in relation Ri. Given the relative 
frequencies of the join attribute values b\, &2, • ■ • , b m , the join selectivity of R\ n 
i?2 is equal to [5] 

2 

Since /i = L, 1 ^ 2 and the size of the result set of the cross product R\ x R2 is 
equal to the product \Ri | I-R2I, the cardinality of the result set associated with the 
join operation Ri n R2 is determined by the magnitude of the join selectivity. 

By extending the previous analysis, the join selectivity [i can be considered 
as the probability of the event that two randomly picked tuples, belonging to the 
relations R\ and R2 respectively, join on the same join attribute value. Based 
on this observation an analytical formula concerning the size of the result set of 
the chain join (which is one of the most common form of the join operation) is 
proven. Especially we state that the join selectivity of the chain join, denoted 
by R = M* =1 Ri(Ai-i 7 Ai), is equal to the product of the selectivities Hts+i of 
the constituent binary operation Ri{Ai-\, Ai) x R i+ i(A i} A i+i ) under a certain 
condition of independence. In our notation, we omit to include attributes in the 
relations that do not participate in the join process. Formally, we have the fol- 
lowing 

Lemma Given that the values of the join attributes Ai in a chain join of k 
relations are independent of each other, the overall join selectivity of the chain 
join, denoted by [A, is equal to the product of the selectivities of the constituent 
binary join operations, i.e., fi = Yii=i A*i,i+i- 

Proof: We define a pair of random variables (Xi,Yi) for every relation R4, 
where i = 2, . . . ,k — 1. Specifically, the random variable X^ corresponds to the 
join attribute Ri-Ai and it is defined as the function X^(i) : Qi — > Nx ; , where fii is 
the set of the tuples in the relation Ri. Nx ; stands for the set {0, 1, ... , Z?^ — 1}, 
where Z? J 4 i is the domain of the join attribute Ai. In other words, Nx ; defines an 
enumeration of the values of the join attribute Ai , in such a way that there is a 
one-to-one correspondence between the values of the set D^ and Nx ; ■ Similarly, 
the random variable Y^(i) : Qi — > Ny t corresponds to the join attribute Ai+i, 
where Ny ; represents the set {0, 1, ... , \Da {+1 \ — 1}. 

As for the edge relations R\ and Rk, only the random variables Yi and X^ 
are defined, since the attributes Ri-Aq and Rk-Ak do not participate in the join 
process. 
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Let TZ denote the event of the join process. Then we have that 
p(K) = p(Yi = X 2 A Y 2 = X 3 A . . . A Y k ^ = X k ) 

By assumption, the random variables are independent of each other. Thus, 

it is valid to say that 

fe-i 



P (K) = l[p(Y l =X. l+1 ) 



Moreover, p(Yj = Xj+i) represents the probability of the event that two ran- 
domly picked tuples from relations Ri and Ri+i agree on their values of the join 
attribute At. Since it holds that p(Yj = Xj+i) = /j,.,_i, the lemma follows. D 
As a direct consequence of the previous lemma, the cardinality of the result 
set associated with the join operation R = m|L 1 Ri(Ai-i,Ai) is given by the 
formula 

fc— 1 k 

1=1 3=1 

3.1 Homogeneous Input Relations 

Firstly, we examine the natural join of two homogeneous relations R\(A, B) x 
i?2(B, C) in the context of the join product skew effect. In the case of the homo- 
geneous relations the distribution of the join attribute values hi is the same for 
both input relations R\ and R 2 - That is, there exists a distribution / such that 
fi(b) — /2(b) = fib) for any b g D. In this setting, the distribution / is skewed 
when there are join attribute values bi, bj £ D such that /(&*) S> f{bj). 

The join attribute values with the same relative frequency ff. defines the 
frequency class C k — {b e D \ fib) = f k }. 

Thus, the domain D of the join attribute B is disjointly separated into classes 
of different frequencies. This separation can be represented with a two level tree, 
called frequency tree. The nodes of the first level correspond to classes of different 
frequencies. The k th node of the first level is labeled with C k . The descendant 
leaves of the labeled node C k correspond to the join attributes belonging to 
class Ck- Each leaf is labeled with the value of one of the join attributes of 
the class corresponding to the parent node. The following picture depicts the 
structure of a simple frequency tree for join operation R-± N R 2 assuming that 
D = {61, . . . , be} is separated into four frequency classes C\, . . . , C4. 



root 



C\ C*2 C3 C4 

62 b\ be b 3 b 5 fr 4 
Fig. 1. The frequency tree for R\ n R2 
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The number of produced joined tuples for a given class Ck is equal to 
| Ck | fk I Ri 1 1 Ra 1 since fk\Ri\ tuples of relation i?i matches with fk \ R2 | tuples of re- 
lation i?2 on any join attribute value b E Ck- Let N be the number of the database 
processors participating in the computation of the join operation. Since only the 
join product skew effect is considered, the workload associated with each node is 
determined by the size of the partial result set that is computed locally. In order 
the workload of the join operation to be evenly apportioned on the N database 

processors, each node should produce approximately ( ' =1 jv ) \Ri | I-R2 1 num- 
ber of joined tuples, where K denotes the number of frequency classes. In terms 
of the frequency classes, this is equivalent to an appropriate assignment of cithcr 
cntire or subset of frequency class (es) to each database processors in order to 
achieve the nearly even distribution of the workload. This assignment can be 
represented by the selection of some internal nodes and leaves in the frequency 
tree. By construction, the selection of an internal node in the frequency tree 
amounts to the exclusive assignment of the corresponding frequency class to 
some database processor. Thus, this database processor will join tuples from the 
relations i?i and R2 whose join attribute value belongs to the selected class. Fi- 
nally, to guarantee the integrity of the final result set, the sequence of selections 
must span all the leaves of the frequency tree. 



3.2 Heterogeneous Input Relations 

We extend the previous analysis in the case of heterogenous input relations. The 
join attribute values are distributed to the input relations R\ (A, B) and R2(B, C) 
according to the data distributions f\ and /2, respectively. In general, it holds 
that the relative frequencies of any join attribute value b £ D are different in the 
relations R\ and R2, i.e., fi(b) ^ /2(b) for any b E D. The above are depicted 
in table [1] 

The number of joined tuples corresponding to the join attribute value b E D 
is rendered by the product fi(b) /2(b) ■ Thus, the join product skew happens when 
fi{bi).f2(bi) ^> f2(bj)f2(bj) for some bi,bj E D. This means that the workload 
of the join process for the database processor, to which the tuples with join 
attribute value equal to bi have been shipped at the redistribution phase, will 
be disproportional compared with the respective workload of another database 
processor. Similarly to section I3TT1 the classes Ck = {b E D \ fi(b)f2(b) = fk} 
disjointly partition the join attribute values. 

Alternatively, it is possible the definition of classes of ranges of frequencies 
according to the schema Ck = {b E D fk-\ < fi{x)f2{x) < fk} (range 
partitioning in the frequency level). 

The "primary- key-to-foreign- key" join consists a special case of heterogeneity 
where in one of the two relation, say R\ , two different tuples always have different 
values in the attribute B. This attribute is called primary key and its each value 
b E D uniquely identifies a tuple in relation R\, As to relation R2, attribute 
B, called foreign key, matches the primary key of the referenced relation R\. 
In this setting, which is very common in practice, we have that fi(bi) = — for 
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any bi G D, and in general /a(&j) ^ — with fzibi) > 0. The join product skew 
happens when f2(pi) ^ fz(bj) for some bi,bj G D, since /i(6j) = fi(bj). Thus, 
the separation of the join attribute values into disjoint frequency classes can be 
defined with respect to the data distribution f%, i.e., Ck — {x G D J2{x) = /&}. 



Join Attribute Values 


Rx 


R 2 


6i 


Mbi) 


Mbi) 








b m 


fl(bm) 


Hbm) 



Table 1. Relative frequencies of the join attribute values. 



4 Algorithm HJPS 

In this section, we propose an algorithm, called HJPS, that alleviates the join 
product skew effect. The algorithm deals with the case of the binary join oper- 
ation R(A, B) ix S(B, C) in which the join predicate is R.B = S.B. 

Let D = {&i, 62, ..., b m } be the domain of values associated with the join 
attribute B. We denote by \Rb t \ (\Sb t \) the number of tuples of the relation 
R (respectively S) with join attribute value equal to bi, where bi G D. The 
algorithm considers that the quantities l-R&J, \Sbi\ for every bi G D are known 
in advance by either previously collected or sampled statistics. We also denote 
by n the number of the database processors. In our setting, all the database 
processors are supposed to have identical configuration. 

As it has been mentioned earlier, the number of the needed computations 
for the evaluation of the join operation, that identifies the total processing cost 
{TPC), is determined by the sum of products of the number of tuples in both 
relations that have the same join attribute values. This means that TPC is 
expressed by the equation 



TPC = Y^ \ R b 



\S, 



biED 



In the context of the parallel execution of the join operator, the ideal workload 
assigned to each processor, denoted by pwl, is defined as the approximate number 
of the joined tuples that it should produce in order not to experience the join 
product skew effect. Obviously, it holds that that pwl = TPC/n. 

HJPS determines whether or not a join attribute value bi G D is skewed by 
the number of the processors dedicated to the production of the joined tuples 
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corresponding to this value. To be more specific, the quotient of the division of 
the number of joined tuples associated with the join attribute value bi (which 
is equal to \Rb t \ * I'SfaJ) by pwl gives the number of the processors needed to 
handle this attribute value. In the case that the result of the division, denoted 
by vwlbi , exceeds the value of two, the algorithm considers the join attribute 
value as skewed. The latter is inserted into a set of values, denoted by SK. 

Let SK = {b ai ,b a2 ,b a3 , —,b a ,} be the set of the skewed values. The algorithm 
iterates over the set SK. In particular, for the value b ai , suppose that the number 
of the needed processors is equal to vwlb a ■ The algorithm takes a decision based 
on the number of tuples with join attribute value b ai in relations R and S. If 
\Rb a I > \Sb a |, the tuples of the relation R are redistributed to the first vwlb a 
processors while all the tuples from the second relation are duplicated to all of 
the vwlb a processors. In order to decide which of the vwlb a processors is going 
to receive a tuple of the relation R with join attribute value b ai , the algorithm 
applies a hash function on a set of attributes. On the contrary, if it holds that 
\Rb a I < \Sb a |, all the tuples from the relation R with join attribute value equal 
to b ai arc duplicated to all of the vwlb a processors while the tuples of the relation 
S arc distributed to all of the vwlb a processors according to a hash function. 
The same procedure takes place for the rest skewed values. The remaining tuples 
are redistributed to the rest processors according to a hash function on the join 
attribute. A Pseudocode of the algorithm is given below. 



5 Conclusion and Future Work 



We address the problem of join product skew in the context of the PDBMS. 
In our analysis, the apriori knowledge of the distribution of the join attribute 
values has been taken for granted. We concentrated on the case of partitioned 
parallelism, according to which the join operator to be parallelized is split into 
many independent operators each working on a part of data. We introduced the 
notion of frequency classes and we examined its application in the general cases 
of homogeneous and heterogeneous input relations. Furthermore, an heuristic 
algorithmic called HJPS is proposed to handle join product skew. The proposed 
algorithm identifies the skew elements and assigns a specific number of processors 
to each of them. Given a skewed join attribute value, the number of dedicated 
processors is determined by the process cost for computing the join for this 
attribute value, and by the workload that a processor can afford. 

We are looking at generalizing our analysis with frequency classes at multiway 
joins. In this direction we have proven the lemma of section [3] which is about the 
chain join of k relations. Furthermore, other types of multiway join operations, 
e.g., star join, cyclic join, are going to be studied in the perspective of the 
data skew effect and under the context of frequency classes. Finally, in a future 
work we will examine the case of multiway joins supposing that no statistical 
information about the distribution of the join attribute values is given in advance. 
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Algorithm HJPS (* Handling Join Product Skew *) 

Input: t ri tuples of relations R and t Tj tuples of relations S, N number of 

processors. 

Output: correspondence of tuple to processor 

Consider the join attribute value is the set: 

D = {61,62, ...,6m} 

(* compute all frequencies for every join attribute value inD *) 

for j : — 61 to b m do 

calculate the frequencies /_r . , fs • ; 
TPC = J2 b -eD \ R bi I * \ S W I (*TCP the total process cost*) 
pwl = TPC/N 

(*pwl the process cost of each processor*) 
vwl bi - \R bi \ * \S bi \; 

^vwhi the process cost for each join attribute value 6j) 
pn bi = vwlbi /pwl; 

(*P?iiji ideal number of processors for the join attribute value 6;*) 
if (pnbi >~ 2) consider 6j a skewed value; 

Let SK = {b ai , ba 2 , b a3 , ■■■, b ai } be the set of skewed values 
for j : = di to 01 do 
if \R bai I > \S bai 

distribute every t ri to the next vn^ processors; 
(*for distribution use a hash function to a set of attributes*) 

send every t Si to the next vn bi processors; 
else 

distribute every t Si to the next im^ processors; 

send every t Ti to the next vn bi processors; 
assign rest tuples from both relations to the rest processors; 
(*for assignment HJPS applies a hash function to the join attribute *) 
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